Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support binary case #215

Merged
merged 8 commits into from
Aug 23, 2022
Merged

Support binary case #215

merged 8 commits into from
Aug 23, 2022

Conversation

AllenDowney
Copy link
Contributor

@AllenDowney AllenDowney commented Aug 22, 2022

Picking up where #200 left off...

If a user provides only two labels, we check that they are mutually exclusive and, if so, train the binary model. We log which column we're keeping.

Outstanding:

  • tests (several of the existing tests hit the binary case)
  • documentation

Note: I gave this a quick try on a dataset of 100 videos balanced evenly between blank and non blank and we indeed see some learning.

      Validate metric             DataLoader 0
─────────────────────────────────────────────────────────
species/val_accuracy/blank            0.75
   species/val_f1/blank        0.7826086956521738
species/val_precision/blank    0.6923076923076923
 species/val_recall/blank              0.9
       val_accuracy                   0.75
         val_loss              0.6593988537788391
       val_macro_f1            0.7442455242966751

Some downsides to the binary case is that these metrics can look misleading when the class are imbalanced. Not sure the best way to warn users about that. If users provide highly imbalanced data, models may learn problematically to only predict the default class, but I suppose that is the same in the multilabel case.

@netlify
Copy link

netlify bot commented Aug 22, 2022

Deploy Preview for silly-keller-664934 ready!

Name Link
🔨 Latest commit 22b3ff4
🔍 Latest deploy log https://app.netlify.com/sites/silly-keller-664934/deploys/63051d3f3d0d8d00098edf79
😎 Deploy Preview https://deploy-preview-215--silly-keller-664934.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@github-actions
Copy link
Contributor

github-actions bot commented Aug 22, 2022

@ejm714 ejm714 mentioned this pull request Aug 22, 2022
2 tasks
Copy link
Collaborator

@ejm714 ejm714 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good, not approving yet since we'll want a test for the binary case too.

zamba/models/config.py Outdated Show resolved Hide resolved
zamba/models/config.py Outdated Show resolved Hide resolved
zamba/models/config.py Show resolved Hide resolved
@AllenDowney
Copy link
Contributor Author

Looks like the failing test is

ImportError: Densepose not installed. See: https://zamba.drivendata.org/docs/stable/models/densepose/#installation

Seems like we need to either install densepose in the test environment or set ZAMBA_RUN_DENSEPOSE_TESTS=0

@AllenDowney
Copy link
Contributor Author

Looks like this was broken a few days ago: https://github.com/drivendataorg/zamba/runs/7904627719?check_suite_focus=true

@codecov-commenter
Copy link

codecov-commenter commented Aug 23, 2022

Codecov Report

Merging #215 (25b17c7) into master (e697afc) will increase coverage by 0.0%.
The diff coverage is 92.3%.

❗ Current head 25b17c7 differs from pull request most recent head 22b3ff4. Consider uploading reports for the commit 22b3ff4 to get more accurate results

@@          Coverage Diff           @@
##           master    #215   +/-   ##
======================================
  Coverage    86.9%   87.0%           
======================================
  Files          29      29           
  Lines        1910    1918    +8     
======================================
+ Hits         1661    1669    +8     
  Misses        249     249           
Impacted Files Coverage Δ
zamba/models/config.py 97.2% <92.3%> (+<0.1%) ⬆️

@ejm714 ejm714 self-requested a review August 23, 2022 19:01
Copy link
Collaborator

@ejm714 ejm714 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @AllenDowney you can merge once the tests finish passing

@AllenDowney AllenDowney merged commit ebc698e into master Aug 23, 2022
@AllenDowney AllenDowney deleted the binary-classifier branch August 23, 2022 19:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants